The ImageNet Shuffle: Reorganized Pre-training for Video Event Detection

Abstract

This paper strives for video event detection using a representation learned from deep convolutional neural networks. Unlike the leading approaches, which all learn from the 1,000 classes defined in the ImageNet Large Scale Visual Recognition Challenge, we investigate how to leverage the complete ImageNet hierarchy for pre-training deep networks. To deal with the problems of over-specific classes and classes with few images, we introduce a bottom-up and top-down approach for reorganization of the ImageNet hierarchy based on all its 21,814 classes and more than 14 million images. Experiments on the TRECVID Multimedia Event Detection 2013 and 2015 datasets show that video representations derived from the layers of a deep neural network pre-trained with our reorganized hierarchy i) improve over standard pre-training, ii) are complementary among different reorganizations, iii) maintain the benefits of fusion with other modalities, and iv) lead to state-of-the-art event detection results. The reorganized hierarchies and their derived Caffe models are publicly available at http://tinyurl.com/imagenetshuffle.
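The abstract does not spell out the reorganization procedure, so the following is only a minimal sketch of one plausible reading of the "bottom-up" idea: leaf classes with too few images are folded into their parent, repeatedly, until every remaining leaf has enough images. The function name `bottom_up_merge`, the `min_images` threshold, the toy class names, and the image counts are all hypothetical, and the sketch assumes a tree, whereas the WordNet structure underlying ImageNet allows a class to have more than one parent.

```python
def bottom_up_merge(parent, image_count, min_images):
    """Fold small leaf classes into their parents (illustrative only).

    parent:      dict mapping class -> parent class (the root maps to None)
    image_count: dict mapping class -> images labeled directly with that class
    min_images:  hypothetical threshold below which a leaf class is merged
    Returns (mapping from each original class to its surviving class,
             image counts of the surviving classes).
    """
    counts = {c: image_count.get(c, 0) for c in parent}

    # Number of children of each class that have not been merged away.
    alive_children = {c: 0 for c in parent}
    for c, p in parent.items():
        if p is not None:
            alive_children[p] += 1

    def depth(c):
        d = 0
        while parent[c] is not None:
            c = parent[c]
            d += 1
        return d

    merged_into = {}
    # Visit the deepest classes first so children are handled before parents.
    for c in sorted(parent, key=depth, reverse=True):
        p = parent[c]
        if p is None:
            continue  # never merge the root
        if alive_children[c] == 0 and counts[c] < min_images:
            counts[p] += counts.pop(c)  # the parent inherits the images
            alive_children[p] -= 1
            merged_into[c] = p

    def resolve(c):
        # Follow the merge chain up to the class that survived.
        while c in merged_into:
            c = merged_into[c]
        return c

    return {c: resolve(c) for c in parent}, counts


# Toy hierarchy with made-up counts (not ImageNet statistics).
parent = {"animal": None, "dog": "animal", "husky": "dog",
          "corgi": "dog", "cat": "animal"}
image_count = {"animal": 0, "dog": 100, "husky": 900, "corgi": 40, "cat": 1200}

mapping, counts = bottom_up_merge(parent, image_count, min_images=500)
print(mapping["corgi"], counts["dog"])
```

In this toy run the sparse leaf `corgi` is relabeled as `dog`, while `husky` and `cat` survive unchanged. A top-down counterpart would presumably start from the root and decide where to cut the hierarchy, but the abstract gives no detail on either direction, so the paper itself remains the reference for the actual method.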